Improvements of search error risk minimization in Viterbi beam search for speech recognition

Authors

  • Takaaki Hori
  • Shinji Watanabe
  • Atsushi Nakamura
Abstract

This paper describes improvements in a search error risk minimization approach to fast beam search for speech recognition. In our previous work, we proposed this approach to reduce search errors by optimizing the pruning criterion. While conventional methods use heuristic criteria to prune hypotheses, our proposed method employs a pruning function that makes a more precise decision using rich features extracted from each hypothesis. The parameters of the function can be estimated to minimize a loss function based on the search error risk. In this paper, we improve this method by introducing a modified loss function, arc-averaged risk, which potentially has a higher correlation with actual error rate than the original one. We also investigate various combinations of features. Experimental results show that further search error reduction over the original method is obtained in a 100K-word vocabulary lecture speech transcription task.
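The contrast drawn in the abstract, a heuristic pruning criterion versus a pruning function over per-hypothesis features, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Hypothesis` type, the three features (bias, score gap to the best hypothesis, normalized rank), the logistic form, and the hand-set weights are all assumptions made for illustration. In the paper the parameters are estimated by minimizing a search-error-risk loss, which is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    state: int
    log_score: float  # accumulated Viterbi log score (higher is better)


def beam_prune(hyps, beam_width):
    """Conventional heuristic: keep hypotheses within beam_width of the best score."""
    best = max(h.log_score for h in hyps)
    return [h for h in hyps if best - h.log_score <= beam_width]


def features(h, hyps):
    """Hypothetical features: bias, score gap to the best, normalized rank."""
    best = max(x.log_score for x in hyps)
    rank = sum(1 for x in hyps if x.log_score > h.log_score)
    return [1.0, best - h.log_score, rank / len(hyps)]


def learned_prune(hyps, weights, threshold=0.5):
    """Keep a hypothesis when a logistic function of its features is >= threshold."""
    kept = []
    for h in hyps:
        z = sum(w * f for w, f in zip(weights, features(h, hyps)))
        keep_prob = 1.0 / (1.0 + math.exp(-z))
        if keep_prob >= threshold:
            kept.append(h)
    return kept


# Toy frame with three active hypotheses; the weights are illustrative, not trained.
hyps = [Hypothesis(0, -10.0), Hypothesis(1, -12.0), Hypothesis(2, -25.0)]
beam_kept = beam_prune(hyps, beam_width=5.0)
learned_kept = learned_prune(hyps, weights=[4.0, -0.5, -2.0])
```

With these toy values both rules keep states 0 and 1 and prune state 2. The difference is that the learned rule can weigh several features at once instead of relying on a single score threshold.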

Similar resources

Various Robust Search Methods in a Hungarian Speech Recognition System

This work focuses on the search aspect of speech recognition. We describe some standard algorithms such as stack decoding, multi-stack decoding, the Viterbi beam search and an A* heuristic, then present improvements on these search methods. Finally we compare the performance of each algorithm, grading them according to their performance. We will show that our improvements can outperform the sta...

Aggregation Operators and Hypothesis Space Reductions in Speech Recognition

In this paper we deal with the heuristic exploration of general hypothesis spaces arising both in the HMM and segment-based approaches of speech recognition. The generated hypothesis space is a tree where we assign costs to its nodes. The tree and the costs are both generated in a top-down way where we have node extension rules and aggregation operators for the cost calculation. We introduce a ...

Backward Viterbi beam search for utilizing dynamic task complexity information

The backward Viterbi beam search has not received enough attention other than being used in the second pass. The reason is that the speech recognition society has long ignored the concept of dynamic complexities of a speech recognition task which can help us to determine whether we should operate Viterbi decoding in forward or backward direction. We use the U.S. street address entry task as on...

Combining stochastic and linguistic language models for recognition of spontaneous speech

In this paper we present a new approach of combining stochastic language models and traditional linguistic models to enhance the performance of our spontaneous speech recognizer. We compile arbitrary large linguistic context dependencies into a category based bigram model which allows us to use a standard beam-search driven forward Viterbi algorithm for real time decoding. Since this recogniz...

Realtime Viterbi Searching for Practical Telephone Speech Recognition Systems

This paper studies searching and pruning process of the telephone speech recognition system for Private Automatic Branch Exchange (PABX) to explore the possible problems encountered in applying speech recognition to telephone network and to prepare the necessary techniques for the practical telephone speech recognition systems. Experiment on a baseline system which uses semi-syllable based mult...

Publication date: 2010